Linear Programming for Large-Scale Markov Decision Problems
Abstract
We consider the problem of controlling a Markov decision process (MDP) with a large state space so as to minimize average cost. Since competing with the optimal policy is intractable for large-scale problems, we pursue the more modest goal of competing with a low-dimensional family of policies. We use the dual linear programming formulation of the MDP average cost problem, in which the variable is a stationary distribution over state-action pairs, and we take as the comparison class a neighborhood of a low-dimensional subset of the set of stationary distributions (defined in terms of state-action features). We propose a technique based on stochastic convex optimization and give bounds showing that the performance of our algorithm approaches the best achievable by any policy in the comparison class. Most importantly, this result depends on the size of the comparison class, but not on the size of the state space. Preliminary experiments show the effectiveness of the proposed algorithm in a queuing application.
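To make the dual linear programming view concrete, the following is a minimal sketch on a hypothetical two-state, two-action MDP. It is not the algorithm proposed in the paper. The transition table P, the cost table c, and all function names are assumptions made up for illustration. The sketch computes the stationary state-action distribution mu of each deterministic policy, checks that mu satisfies the dual LP constraints (nonnegative entries summing to one, and flow balance: the mass leaving each state equals the mass arriving), and evaluates the LP objective, which is the average cost sum over (x, a) of mu(x, a) c(x, a).

```python
from itertools import product

# Hypothetical toy MDP (all numbers invented for illustration).
# P[x][a][y] is the probability of moving from state x to state y under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
# c[x][a] is the cost of taking action a in state x.
c = [
    [1.0, 2.0],
    [0.5, 3.0],
]
N_STATES, N_ACTIONS = 2, 2


def stationary_state_action(policy, iters=2000):
    """Return mu[x][a], the stationary state-action distribution of `policy`.

    policy[x][a] is the probability of action a in state x. Plain power
    iteration is used; it converges here because every transition
    probability in the toy MDP is positive.
    """
    nu = [1.0 / N_STATES] * N_STATES  # state distribution, uniform start
    for _ in range(iters):
        nxt = [0.0] * N_STATES
        for x in range(N_STATES):
            for a in range(N_ACTIONS):
                weight = nu[x] * policy[x][a]
                for y in range(N_STATES):
                    nxt[y] += weight * P[x][a][y]
        nu = nxt
    return [[nu[x] * policy[x][a] for a in range(N_ACTIONS)]
            for x in range(N_STATES)]


def average_cost(mu):
    """Dual LP objective: expected cost under the distribution mu."""
    return sum(mu[x][a] * c[x][a]
               for x in range(N_STATES) for a in range(N_ACTIONS))


def flow_violation(mu):
    """Largest gap in the flow-balance constraint over all states."""
    worst = 0.0
    for y in range(N_STATES):
        outflow = sum(mu[y])
        inflow = sum(mu[x][a] * P[x][a][y]
                     for x in range(N_STATES) for a in range(N_ACTIONS))
        worst = max(worst, abs(outflow - inflow))
    return worst


# Enumerate the four deterministic policies; actions[x] is the action in state x.
results = {}
for actions in product(range(N_ACTIONS), repeat=N_STATES):
    policy = [[1.0 if a == actions[x] else 0.0 for a in range(N_ACTIONS)]
              for x in range(N_STATES)]
    mu = stationary_state_action(policy)
    results[actions] = (average_cost(mu), flow_violation(mu))

best_actions = min(results, key=lambda k: results[k][0])
for actions, (cost, gap) in sorted(results.items()):
    print(actions, round(cost, 4), "feasible" if gap < 1e-9 else "infeasible")
print("lowest average cost among deterministic policies:", best_actions)
```

Enumerating policies is only possible because the toy problem is tiny. With a large state space, neither the enumeration nor the full table mu is tractable, which is the motivation the abstract gives for restricting attention to a low-dimensional family of distributions defined through state-action features.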
Similar resources
A New Compromise Decision-making Model based on TOPSIS and VIKOR for Solving Multi-objective Large-scale Programming Problems with a Block Angular Structure under Uncertainty
This paper proposes a compromise model, based on a new method, to solve the multi-objective large-scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. In this compromise programming method, two concepts are considered simultaneously. First of them is that the optimal ...
A Compromise Decision-Making Model Based on TOPSIS and VIKOR for Multi-Objective Large-Scale Nonlinear Programming Problems with A Block Angular Structure under Fuzzy Environment
This paper proposes a compromise model, based on a new method, to solve the multiobjective large scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. In this compromise programming method, two concepts are considered simultaneously. First of them is that the optimal alter...
A Compromise Decision-making Model for Multi-objective Large-scale Programming Problems with a Block Angular Structure under Uncertainty
This paper proposes a compromise model, based on the technique for order preference through similarity ideal solution (TOPSIS) methodology, to solve the multi-objective large-scale linear programming (MOLSLP) problems with block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. This compromise programming method is ba...
A Non-linear Integer Bi-level Programming Model for Competitive Facility Location of Distribution Centers
The facility location problem is a strategic decision-making for a supply chain, which determines the profitability and sustainability of its components. This paper deals with a scenario where two supply chains, consisting of a producer, a number of distribution centers and several retailers provided with similar products, compete to maintain their market shares by opening new distribution cent...
Efficient Linear Approximations to Stochastic Vehicular Collision-Avoidance Problems
The key components of an intelligent vehicular collision-avoidance system are sensing, evaluation, and decision making. We focus on the latter task of finding (approximately) optimal collision-avoidance control policies, a problem naturally modeled as a Markov decision process. However, standard MDP models scale exponentially with the number of state features, rendering them inept for large-sca...
Publication date: 2014